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I-INTRODUCTION: 

Robotic  manipulators  are  highly  non-linear,  dynamically  coupled,  multi¬ 
axis  electromechanical  systems  that  are  commonly  used  in  tasks  such  as 
welding,  material  handling  in  manufacturing,  and  as  accurate  positioning 
systems.  In  these  tasks,  the  end-effectors  are  required  to  move  from  one  place  to 
another,  or  to  follow  some  desired  trajectories  as  close  as  possible  at  a  fast 
operational  speed  in  a  free  workspace.  Trajectory  tracking  control  of  robots  is, 
thus,  of  practical  significance,  and  is  the  simplest  yet  the  most  fundamental  task 
in  robot  control  [1].  Concurrent  advances  in  microprocessor  technology  have 
made  the  implementation  of  complicated  non-linear  control  algorithms 
practically  feasible.  Robots  may  have  to  manipulate  loads  of  different  weight, 
size  and  mass  distributions,  in  which  case,  the  dynamics  of  the  robots  will  also 
be  different.  Controllers  designed  for  a  particular  nominal  payload  may  not  be 
able  to  control  the  system  properly  for  all  changes  in  the  parameters.  Some 
model  based  adaptive  controllers  have  been  proposed  based  on  linear  in  the 
parameters  models.  The  controllers  are  both  stable  and  robust  under  certain 
conditions.  The  system  can  achieve  asymptotic  tracking  without  using  infinite- 
gain  feedback  through  on-line  parameter  adaptation  when  the  system  is  subject 
to  parametric  uncertainties  only  [2].  However,  instability  may  occur  due  to 
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some  bounded  uncertain  disturbances  or  plant  non-linearity  [3],  making  zero 
tracking  accuracy  no  longer  guaranteed  since  the  steady  state  tracking  error  can 
only  be  shown  to  stay  within  an  unknown  ball  whose  size  depends  on  the 
disturbances.  Recently,  suggestions  have  been  made  to  use  neural  networks  to 
overcome  the  need  for  linear  parameterization  models.  In  many  schemes,  neural 
networks  are  firstly  "trained"  off-line  to  obtain  sufficiently  accurate 
approximations  of  the  input-output  functions  of  the  systems,  and  the  networks 
can  be  appropriately  used  to  construct  controllers.  Other  attempts  were  made  to 
train  the  neural  network  on-line  during  normal  system  operations  [4,5].  It  is 
known  that  neural  networks  have  the  static  mapping  capability.  However,  when 
they  are  used  as  controllers,  they  are  able  to  realize  the  dynamics  [6].  Despite 
the  fact  that  neural  networks  are  very  powerful  in  learning  complicated 
dynamics,  the  sizes  of  the  networks  are  known  to  be  very  large,  especially  the 
dynamic  ones,  which  subsequently  leads  to  the  need  of  powerful  computational 
facilities.  Recent  research  [7]  on  parsimonious  construction  algorithm  for  linear 
in  the  parameters  neural  networks  overcomes  the  curse  of  dimensionality 
associated  with  dynamic  neural  networks.  As  this  is  a  non-linear  optimization 
scheme,  it  can  only  be  achieved  off-line. 

The  goal  of  the  present  work  is  the  design  of  three  types  of  Multi¬ 
layered  neural  network  controllers  applied  to  control  a  dynamic  2D-Robot 
manipulator.  During  this  design,  different  learning  techniques  will  be  used  for 
temporal  process,  and  a  study  of  the  behavior  of  the  system  will  be  investigated. 
The  proposed  architectures  for  training  different  neural  network  controllers  are 
used  to  provide  the  appropriate  inputs  to  the  2D-Robot  arm  so  that  the  desired 
response  can  be  obtained.  The  first  architecture  is  a  neural  network  controller 
that  anticipate  the  output  of  a  PD  controller  and  which  weights  are  tuned  due  to 
the  Torque  control.  The  second  architecture  is  the  same  as  the  first  one  but  in 
which  the  weights  of  the  neural  network  are  tuned  by  minimizing  the  output  of  a 
PD  controller.  The  third  architecture  will  use  an  artificial  neural  network 
controller  that  anticipates  the  desired  input  of  the  closed  loop  system  consisting 
of  a  PD  controller  in  cascade  with  a  2D-Robot  arm.  The  search  of  optimum 
architectures  of  the  controllers  will  be  carried  out.  The  performances  of  the 
three  controllers  with  different  trajectories  will  be  examined.  Finally,  a 
comparative  study  with  respect  to  desired  performances  will  be  investigated. 

2  -  2D  DYNAMIC  PLANAR  ARM; 

2-1  -  Parameters  of  the  planar  arm; 

Our  focus  in  this  section  is  to  provide  the  design  fundamentals  for  neural 
networks  controllers  used  in  robotics  systems.  Trajectory  control  of  robotic 
manipulators  traditionally  consists  of  following  a  preprogrammed  sequence  of 
end  effector  movements.  Robot  control  usually  requires  control  signals  applied 
at  the  joints  of  the  robot  while  the  desired  trajectory,  or  sequence  of  arm  end 
positions,  is  specified  for  the  end  effector. 
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End  effector  point 
(x,y) 


qj,  (joint  angles) 


Fig.  1:  -  Dynamic  planar  robot  arm. 


Figure  1  shows  a  planar  manipulator  with  two  re  volute  joints.  Let  us  fix 
notation  as  follows: 

For  i  =  1,2 

qi  denotes  the  joint  angle,  which  also  serves  as  a  generalized  coordinate, 
m,  (kg)  denotes  the  mass  of  link  i  (rni=9.5\  m2=5). 
him)  denotes  the  length  of  link  i  (l]=0.25;  12=0.16). 
lei  (m)  denotes  the  distance  from  the  previous  joint  to  the  center  of  mass 
of  hnk  i  (lci=0.2;  lc2=0.14). 

7,  (kgm  )  denotes  the  moment  of  inertia  of  link  i  about  an  axis  coming 
out  of  the  page  and  passing  through  the  center  of  mass  of  link  i 
{Ii=4.3xl0-^\  12=6.1x10^). 

2-2  -  Equation  of  motion: 

In  the  case  of  the  planar  robotic  manipulators,  two  conditions  are  satisfied: 
First,  the  kinetic  energy  is  a  quadratic  function  of  the  vector  q  of  the  form: 

Where  the  n  x  n  "inertia  matrix"  D  is  symmetric  and  positive  definite  for 
eachq . 

Second,  the  potential  energy  V  =  V  (q )  is  independent  of  q  . 

By  applying  the  Euler-Lagrange  equations  for  such  a  system,  we  obtain: 

Ld]q(q)qj +  Lcijk(q)qiqj +^k(q)  =Xk,  k  =  l,...,n  (2) 

j  ij 

In  the  above  equation,  there  are  three  types  of  terms.  The  first  involve  the 
second  derivative  of  the  generalized  coordinates.  The  second  are  quadratic 
terms  in  the  first  derivatives  of  q ,  where  the  coefficients  may  depend  onq . 

These  are  classified  into  two  types.  The  Centrifugal  terms  involve  a  product  of 
2 

the  typeqj  ,  while  the  Coriolis  terms  involve  a  product  of  the  type  q^qj  where 
The  third  type  of  terms  are  those  involving  only  q  but  not  its  derivatives. 
Clearly  this  third  term  arise  from  differentiating  the  potential  energy.  It  is 
common  to  write  this  equation  in  matrix  form  as: 

D(q)q  +  C(q,q)q  +  g(q)  =  x  (3) 
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By  applying  the  mechanical  equation  of  motion  to  the  manipulator,  we 
obtain: 

u-Bq-x  =  Jq  (4) 

Where  B  is  a  friction  matrix,  J  is  a  diagonal  inertia  matrix  and  u  is  the 
applied  input  torque  to  the  planar  manipulator. 

Combining  equations  (3)  and  (4)  yields: 


M(q)q  +  h(4q)  +  g(q)  =  u  (5) 

Where: 

*  q,  q,  q  are  the  joint  angle,  joint  velocity,  and  joint  acceleration  respectively. 
*M(q)  defines  a  2x2  mass  matrix  that  describes  the  inertial  properties  of  the 

arm. 

*  h(q,q)  =  (C(q,q)+ B)q  defines  a  vector  that  describes  Coriolis  and 

Centripetal  acceleration,  and  friction  terms  in  the  dynamics  of  the  robot 
manipulator. 

*  g(q)  is  a  vector  that  specifies  the  effects  of  gravitational  forces  acting  on  the 
arm. 

*  u  represents  the  input  vector  torque  that  act  on  the  manipulator. 

2-3  -  Robot  dynamic  modeling; 

We  will  make  effective  use  of  the  Jacobian  expressions  in  computing  the 
kinematic  energy.  Since  we  are  using  joint  variables  as  the  generalized 
coordinates,  it  follows  that: 

The  expression  of  the  velocity  of  the  center  of  mass  of  link  one  is: 

(6) 

Where  represent  the  Jacobian  of  link  one  whose  expression  is  give  by: 


Vcl 


^cicosqi  0 
0  0 


(7) 


Similarly,  the  expression  of  the  velocity  of  the  center  of  mass  of  link  two  is: 

Vc2=Jvc2q 

Where  J^^^  represent  the  Jacobian  of  link  two  whose  expression  is  give  by: 


Vc2 


-^jsinqi  -£c2sin(qi  +  q2)  -^c2  sm(qi  +q2) 
^icosqi  +.£c2COs(qi  +q2)  .£c2  cos(qi  +  q2) 

0  0 


(9) 
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Hence  the  translational  part  of  the  kinetic  energy  is: 


-miVciVci  +  — m2Vc2Vc2 


+  m2J- 


c2 


Vc2 


}5 


(10) 


Next  we  deal  with  the  angular  velocity  terms.  Because  of  the  particularly 
simple  nature  of  this  manipulator,  many  of  the  potential  difficulties  do  not  arise. 
First,  it  is  clear  that: 


&i  =qik,  &2  =(qi  +q2)k 


(11) 


When  expressed  in  the  base  inertial  frame. 

The  z-axes  of  all  of  these  frames  are  in  the  same  direction,  so  the  above 
expression  is  also  valid  in  the  link-bound  frame.  Moreover,  since  COj  is  aligned 
-  -  T  - 

with  k ,  the  triple  product  cOj  IjCOj  reduces  simply  to  (133)1  times  the  square  of  the 

magnitude  of  the  angular  velocity.  This  quantity  (133)1  is  what  we  have  labeled  k 
above.  Hence  the  rotational  kinetic  energy  of  the  overall  system  is: 


1  0 
0  0 


(12) 


Finally,  the  expressions  of  the  different  elements  of  the  inertia  matrix  D  are: 

dll  +^2(^1  +^c2  +2f'i^c2^0®q2)+ll  +  l2 

^12  =^21  =  ni2(^c2  +^l^c2  COSq2)  +I2  (13) 

^22  =  1^2-^c2  +I2 


The  Christoff  el  symbols  are  defined  by  the  expressions: 


2aqi 

C121  =C2ll  =-m2^1^c2sin  q2 
C221  =-m2^1^c2sinq2 
C112  =m2^1^c2sin  q2 
C122  =C212  =^222  =0 


(14) 


The  potential  energy  of  the  manipulator  is  just  the  sum  of  those  of  the  two 
links.  For  each  link,  the  potential  energy  is  just  its  mass  multiplied  by  the 
gravitational  acceleration  and  the  height  of  its  center  of  mass.  Hence,  the 
functions  (|)k  become: 

4)1  =(mi£cl  +m2^i)gcosqi  +m2^c2gcos(qi  +q2) 

4'2  =m2^c2gCOS(qi+q2) 


Neural  Networks  for  Robot  Control  -  Final  Report  -  Prof.  Chaiban  NASR. 


Page  5/18 


Finally,  the  dynamical  equations  of  the  system  are: 


diiqi  +di2q2  +ci2i4iq2  +C2ii424i  +C22i42  +4i  - '^i 
d2l4l +^2242  +^1124? +42  = '^2  (16) 


The  robot  model  is  simulated  numerically  using  the  fourth-order  Runge- 
Kutta  method  for  the  solution  of  differential  equations.  Note  that  the  robot 
dynamics  equation  (16)  must  first  be  rearranged  into  standard  form: 

4  =  M-l(q)(u-h(q,4)-g(q))  (17) 

Implementation  of  the  numerical  algorithm  is  then  straightforward. 

3  -  Multi-Layer  Perceptrons  Controllers: 

3-1-  Multi-layer  perceptrons  (MLP); 

Multi-layer  perceptrons  [8]  provide  one  arrangement  for  neural  network 
implementation,  by  means  of  nonlinear  relationships  between,  firstly,  the 
network  inputs  to  outputs  and,  secondly,  the  network  parameters  to  outputs. 
Such  a  network  consists  of  a  number  of  neuron  layers,  n,  linking  its  input 
vector,  u,  to  its  output  vector,  y,  by  means  of  the  equation: 

y  =  (pn(Wn(pn-l(Wn-l...(pl(WiU-l-bi)  -t. .  .-t  bn-l)  +  bn)  (18) 

In  which  Wi  is  the  weight  matrix  associated  with  the  layer,  tpi  is  a 
nonlinear  operator  associated  with  the  ith  layer  and  h  indicate  threshold  or  bias 
values  associated  with  each  node  in  the  ith  layer.  The  function  cpi  is  a  sigmoid 
function  for  all  n. 

It  is  known  in  reality  that  real  neurons,  located  in  different  areas  of  the 
nervous  system,  have  different  modes  of  behavior  [9]  ranging  from  Gaussian- 
like  for  visual  needs  to  sigmoid  for  ocular  motor  needs.  It  is  generally  the  case 
for  artificial  neural  networks,  however,  that  only  one  type  of  non-linearity  is 
employed  for  a  particular  network,  this  linking  in  closely  with  the  fact  that  each 
network  is  only  employed  for  one  particular  task. 


9, 


' -  92 


yi 


72 


M  9: 


Fig:2  :  -  A  two-layer  multi-layer  perceptron. 
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Figure  2  shows  a  fully  connected  network  in  that  all  of  the  neuron  outputs 
in  one  layer  of  the  network  are  connected  as  inputs  to  the  next  layer.  This  is 
normal  practice:  however,  it  is  quite  possible  for  part-connectivity  to  be  realized 
by  connecting  a  group  of  outputs  to  only  specific  input.  By  this  means  sub¬ 
models  can  be  formed  within  the  overall  MLP,  and  these  can  be  particularly 
useful  where  a  specific  system  characteristic  is  to  be  dealt  with  or  inputs/outputs 
from  the  system  can  be  categorized  into  certain  types. 

Whatever  the  network  connectivity,  key  questions  in  the  use  of  MLPs  are 
how  many  layers  there  should  be  and  how  many  neurons  there  should  be  in  each 
layer.  Once  these  structural  features  of  the  network  have  been  selected,  it 
remains  for  the  adjustable  weights  to  be  settled  on  such  that  the  network  is 
completely  specified  in  terms  of  its  functionality. 

3-2  -  Dynamic  back  propagation  learning; 

By  far  the  most  popular  method  employed  for  weight  training  in  MLP 
neural  networks  is  called  back  propagation  [10].  In  the  standard  feed- forward 
MLP  network,  back  propagation  solves  the  problem  of  missing  information  to 
the  hidden  layers,  neither  the  input  to  nor  the  reference  signals  for  the  hidden 
layers  are  known.  This  solving  problem  is  obtained  by  taking  the  inputs  to  the 
hidden  layers  as  being  the  inputs  to  the  first  layers  propagated  through  the 
network.  The  reference  signals  for  the  hidden  layers  are  then  obtained  by  error 
back  propagation  through  the  network.  This  is  realized  by  obtaining  the  partial 
derivative  of  the  squared  error  with  respect  to  the  parameters. 

It  is  worth  pointing  out  that  the  back  propagation  algorithm  has  also  been 
used  for  weight  learning  in  feedback  neural  networks  [11,12],  these  being 
networks  in  which  the  network  structure  incorporates  feedback,  whereby  the 
output  of  every  neuron  is  fed  back,  in  weighted  form,  to  the  input  of  every 
neuron.  The  architecture  of  such  a  network  is  inherently  dynamic  and  realizes 
powerful  capabilities  due  to  its  complexity. 

Consider,  as  a  starting  point,  a  single  neuron  with  output  yi;  then 


Yi  =tp(Xi) 


l-exp(-x.) 

l-i-exp(-Xj) 


In  which: 


WjjUj  -l-w 


0 


(19) 


(20) 


In  this  expression  wo  is  a  bias  term.  If  it  is  assumed  that  at  an  instant  in 
time,  for  an  input  q  the  output  y  should  be  equal  to  the  desired  output  yd,  then 
the  squared  error  of  the  output  signal  is  given  by: 

E,  =i(y,-yir  =  L?  (21) 

and  it  is  desired  to  minimize  Ej  by  means  of  a  suitable/best  choice  of  the 
weighting  coefficients  Wy  . 
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Consider  the  problem  of  minimizing  the  scalar  error  function  E(W),  where 
W  is  a  vector  of  weights  to  be  adjusted  by  means  of  an  interactive  procedure 
generating  a  number  of  search  points,  W(k),  such  that: 

W(k+1)  =  a(k)W(k)  +  ri(k)d(k)  (22) 

In  this  relation  an  initial  set  of  weightings  W(0)  is  made  through  prior 
knowledge,  by  a  reasoned  guess  or  even  relatively  randomly.  The  term 
a(k)W(k)  represent  a  momentum  term  and  a(k)  is  usually  a  positive  number 
called  the  momentum  constant.  The  term  d(k)  indicates  the  search  direction, 
whereas  ri(k)  indicates  the  length  of  search  step  or  the  amount  of  learning  to  be 
carried  out. 

In  this  way,  the  weights  associated  with  one  neuron  can  be  adjusted  in 
order  to  minimize  the  squared  error.  The  approach  can  then  be  extended  in  order 
to  adjust  all  of  the  weights  in  the  MLP  network.  So,  overall,  a  set  of 
input/desired  output  data  values  are  used  to  train  the  entire  network.  The  input 
set  also  realizing  a  corresponding  set  of  network  weights  such  that  the  error 
between  the  desired  output  signals  and  the  actual  network  output  signals  is 
minimized  in  terms  of  the  average  overall  learning  points.  The  back  propagation 
algorithm  employs  the  steepest-descent  method  to  arrive  at  a  minimum  of  the 
mean  squared  error  function.  For  one  specific  data  pair,  the  error  squared  can  be 
written  as: 


1  ^  1  ni 

Ek  =-li(ydk-yik)"  (23) 

^  i=l  ^  i=l 

where  m  neurons  are  assumed  to  be  present,  and  yk  is  the  ith  neuron’s  kth 
output  value. 

The  global  error  is  then  found  by  minimizing  F  over  all  the  data  set.  If 
the  number  of  data  values  that  are  present  is  N,  then  we  have: 

E  =  XEk  (24) 

k=l 


This  error  function  can  then  be  minimized  in  batch  mode  or  recursively  in 
an  on-line  manner. 

The  network  is  now  fully  trained  on  the  data  presented  and  can  be 
employed  with  any  further  data,  although  it  may  be  desirable  to  present  the  data 
again  cyclically  until  the  overall  error  falls  below  a  previously  defined 
minimum  value,  i.e.  until  the  weights  converge.  An  important  feature  then 
emerges  in  that  the  MLP  network  has  the  ability  to  generalize  when  it  is 
presented  with  new  data  not  previously  dealt  with. 

3-3  -  NN  controller  tuned  by  the  torque  control  (First  Method): 

Before  describing  the  neural  network  systems,  we  detail  the  training 
procedure  used  to  provide  error  information  to  the  systems. 
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Fig^  :  -  Feed-forward  controller  trained  by  minimizing  the  torque  error 


The  procedure  involves  training  an  adaptive  feed-forward  controller  to 
control  the  arm  in  conjunction  with  a  fixed  feedback  controller  (Figure  3).  The 
feedback  controller  aids  in  generating  training  data  that  the  forward  controller 
uses  to  learn  a  model  of  the  arm's  inverse  dynamics.  This  direct  approach  to 
training  a  neural  network  controller  has  been  studied  [13].  In  order  to  achieve  a 
desired  accelerationq ,  an  appropriate  torque  x(t)  must  be  applied  to  the  arm.  The 
relationship  between  acceleration  and  torque  is  the  inverse  dynamics  of  the  arm 
and  the  goal  of  the  learning  procedure  is  to  train  a  feed-forward  controller  to 
model  this  relationship.  The  feed-forward  controller  is  trained  on-line  in  the 
following  manner.  At  each  time  step,  the  control  signal  is  obtained  by  summing 
the  outputs  of  the  feed-forward  controller  and  the  feedback  controller.  The 
inputs  to  the  feed-forward  controller  are  the  desired  joint  positions,  velocities, 
and  accelerations  for  the  current  time  step,  as  specified  by  the  reference 
trajectory.  The  inputs  to  the  feedback  controller  (a  PD  controller)  are  the  desired 
and  actual  joint  positions  and  velocities.  The  sum  of  the  feed-forward  and 
feedback  control  signals  is  a  torque  vector  that  is  applied  to  the  arm  (figure  3, 
straight  line).  The  resulting  joint  accelerations  are  observed  and  the  feed¬ 
forward  controller  then  receives  the  actual  joint  positions,  velocities,  and 
accelerations  as  inputs  and  computes  new  outputs  (figure  3,  dashed  line).  An 
error  is  computed  between  this  output  and  the  actual  torque  applied  to  the  arm, 
and  the  error  is  used  to  change  the  weights  in  the  controller  using  back- 
propagation  algorithm  [14].  Early  in  the  training  session,  the  feedback  controller 
dominates  and  the  arm  follows  the  desired  trajectory  imprecisely.  As  the  feed¬ 
forward  controller  learns  to  model  the  arm's  inverse  dynamics,  it  begins  to 
generate  torque  that  allows  the  arm  to  follow  the  desired  trajectory  more 
faithfully. 

The  neural  network  used  is  a  multi-layer  perceptron  with  one  hidden 
layer.  The  input  vector  contains  nine  components  that  are  real  valued  and 
represent  the  robot  arm's  joint  positions,  velocities,  and  accelerations. 

qi ,  q2 ,  q  iCOs(q2),  q2COs(q 2),  q  f sin(  q  2) 
q2sin(q2),  qiq2sm(q2),  cos(qi),  cos(qi+q2)  (25) 
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Those  components  (25)  are  the  transformations  of  the  joint  variables  that  are 
conveniently  chosen  so  as  to  facilitate  the  learning  process  of  the  robot  arm's 
inverse  dynamics  [14].  The  output  layer  has  two  neurons,  and  the  number  of 
hidden  neurons  in  the  hidden  layer  is  chosen  by  simulation.  The  activation 
function  used  is  a  sigmoid  that  uses  the  hyperbolic  tangent  function. 

3-4- NN  controller  tuned  by  minimizing  a  torque  control 
(Second  Method); 

This  second  method  [13,15]  is  represented  in  figure  4. 


Fig^  ^  -  Feedback  Error  Learning  Control  Structure. 


The  technique  presented  here  employs  a  standard  feed-forward  neural 
network.  The  neural  network  output  is  and  the  control  law  of  the  arm  is  given 
by  the  expression: 


U  =  X,+Xn 

(26) 

Where:  Xc=ky8-i-kp8  and  8  =  q(j -q 

(27) 

Combining  the  robot  dynamic  equation  (5)  and  the  control  law  (26),  the  error 

dynamic  equation  can  be  written  as: 

k  -b  k  p8 -b  Xn  =  Mq -b  h(4  q) -b  g(q) 

(28) 

The  training  signal  of  the  neural  network  is  the  output  x^, 
Erom  equation  (28)  the  expression  of  the  output  x^,  will  be: 

of  the  PD  controller. 

Xc  =  Mq  -b  h(4  q)  -b  g(q)  -  X^ 

(29) 

1  T 

The  neural  network  is  trained  to  minimize  E=— .  Thus,  at  convergence 

(x^.  =0),  the  ideal  output  of  the  neural  network  will  represents  the  robot  inverse 

dynamics.  The  neural  network  used  is  a  multi-layer  perceptron  with  one  hidden 
layer.  The  input  vector  contains  four  components  qj,q2,qi,q2  that  are  real 

valued  and  represent  the  robot  arm's  joint  positions,  and  velocities.  The  output 
layer  has  two  neurons,  and  the  number  of  hidden  neurons  in  the  hidden  layer  is 
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chosen  by  simulation.  The  activation  function  used  is  a  sigmoid  that  uses  the 
hyperbolic  tangent  function. 

3-5  -  NN  controller  that  anticipates  the  desired  input  (Third  Method); 

The  third  architecture  uses  an  artificial  neural  network  controller  that 
anticipates  the  desired  input  of  the  closed  loop  system  consisting  of  a  PD 
controller  in  cascade  with  the  robot  arm  as  shown  in  figure  5. 


Fig^  :  -  Neural  Network  that  anticipates  the  desired  input. 

In  this  structure  the  neural  network  controller  is  used  to  modify  the 
desired  trajectory  (qdi,  qcu)  instead  of  generating  a  compensating  torque  as  in  the 
two  previous  methods.  It  has  been  shown  [16]  that  the  trajectory  modification 
approach  for  compensating  robot  model  uncertainties  in  Cartesian  space  motion 
control  is  as  effective  as  compensating  the  control  torque  using  neural  network. 
The  purpose  here  is  to  develop  the  same  control  scheme  for  the  non-model 
based  PD  control  problem.  Let  (|)p  be  the  neural  network  output.  The  closed  loop 


system  equation  is  given  by: 

kve-Fkpe  =  Mq-Fh(4q)-Fg(q)  (30) 

where  e  =  qj. -q  =  qd +^p -q=e-l-^p  (31) 

and  e  =  qr -q  =  qd+(^p-q  =  i  +  <i)p  (32) 

The  expression(|)p  can  be  estimated  from  the  finite  difference: 

ip  =  ((^p(t)-(^p(t-l))/T  (33) 

T  is  the  sampling  period. 

Substituting  the  error  and  its  derivative  into  equation  (30)  yields: 

k^i+kp8  =  Mq  +  h(q,q)  +  g(q)-kv(^p  -kp^p  (34) 


We  choose  the  training  signal  vas  v  =  kyE-l-kp8.  The  neural  network  output  is 
trained  to  satisfy  v  =  0 .  Thus  the  neural  network  output  combined  with  its 
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derivative  will  represent  the  inverse  dynamics  of  the  robot.  The  attractive 
feature  of  this  scheme  is  that  it  can  be  easily  implemented  in  the  trajectory 
planner  that  is  outside  the  closed  loop  system.  The  neural  network  is  trained  to 


minimize 


V  .  The  neural  network  used  is  a  multi-layer  perceptron  with 


one  hidden  layer.  The  input  vector  contains  four  components  qj,q2,qi,q2  that 
are  real  valued  and  represent  the  robot  arm's  joint  positions,  and  velocities.  The 
output  layer  has  two  neurons,  and  the  number  of  hidden  neurons  in  the  hidden 
layer  is  chosen  by  simulation.  The  activation  function  used  is  a  sigmoid  that 
uses  the  hyperbolic  tangent  function. 


4  -  NUMERICAL  SIMULATION  AND  COMPARISON: 


First,  we  simulate  the  robot  arm  without  using  neural  network  controller. 
We  have  considered  a  desired  trajectory  defined  by  the  equations: 

Yn,  =  0.2 -t  0.075sin(27tt/3)  and  Z„,  =  0.2 -i- 0.075cos(27tt/3)  (35) 


We  have  used  At  =  O.Olseconds  for  time  integration.  The  choice  of  the  2x2  gain 
matrices  used  in  the  PD  controller  was  to  obtain  a  stable  system  in  the  closed 
loop  system  with  bad  performances,  ky  =  lO.I  and  kp  =  lOO.I  where  I  represents 
the  identity  matrix  of  order  two.  Figure  8  (blue  line)  shows  the  simulated 
response  of  the  system  for  the  desired  trajectory.  Figures  9  &  10  (blue  line) 
show  the  responses  of  the  joints  angle  and  joints  velocity.  It  is  clear  that  the 
choices  of  those  values  of  gains  are  not  well  adapted  to  the  control  of  this 
dynamic  2-D  robot  arm. 

4-1  -  Numerical  simulation  of  the  First  Method; 


During  the  numerical  simulation  of  the  three  methods,  we  have  used  the 
back-propagation  method  with  the  first  order  convergence  algorithm  for  training 
the  neural  network  in  order  to  determine  the  synaptic  weights.  The  objective 
was  to  minimize  a  cumulative  error  E  defined  by: 

E  =  L[(yd(nAt) - y(nAt))^  -F (zd(nAt) -z(nAt))^]  (36) 

n=l 

where  N  =  t/At  denotes  the  sampling  number  within  one  trial.  yd(nAt),Zd(nAt)  are 
the  demands  of  the  trajectory  on  the  Y-Z  plane  at  the  sampling  time  nit,  and 
y(nAt),z(nAt)  are  the  actual  trajectories. 

In  the  first  method,  we  have  trained  a  number  of  neural  networks  using  the  error 

back-propagation  technique  with  different  learning  rates  and  different 

_'2 

momentum  terms.  Figure  7  shows  an  optimum  learning  rate  equal  to  3.10“  and 
an  optimum  momentum  term  equal  to  0.1.  Figure  6  shows  that  the  (9-18-2) 
multi-layer  perceptron  represents  the  optimal  architecture  with  minimum 
cumulative  error  equal  to  4.27x10'^  obtained  after  40  cycles.  Moreover,  figure  8 
(green  line)  shows  the  tracking  response  of  the  end  effector  to  the  desired 
trajectory,  and  figure  11  shows  the  evolution  of  the  instantaneous  square  error 
between  the  desired  and  actual  position.  This  method  presents  the  best 
convergence  in  comparison  with  the  other  methods. 
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4-2  -  Numerical  simulation  of  the  Second  Method: 


A  number  of  neural  network  eontrollers  have  been  trained  using  the  error 
baek-propagation  teehnique  with  different  learning  rates  and  different 
momentum  terms.  Figure  7  shows  an  optimum  learning  rate  equal  to  10"  and  an 
optimum  momentum  term  equal  to  0.1.  Figure  6  shows  that  the  (4-18-2)  multi¬ 
layer  pereeptron  represents  the  optimal  arehiteeture  with  minimum  eumulative 
error  equal  to  1.13x10'"^  obtained  after  40  eyeles.  Moreover,  figure  10  (green 
line)  shows  the  responses  of  the  joints  veloeities  to  the  desired  inputs,  and  figure 
11  shows  the  evolution  of  the  instantaneous  square  error  between  the  desired 
and  aetual  position. 

4-3  -  Numerical  simulation  of  the  Third  Method: 


This  new  seheme  is  different  from  the  two  other  approaehes  in  that  it 
modifies  the  desired  trajeetory  before  inputting  to  the  robot  PD  eontroller.  A 
number  of  neural  network  eontrollers  have  been  trained  using  the  error  baek- 
propagation  teehnique  with  different  learning  rates  and  different  momentum 
terms.  Figure  7  shows  an  optimum  learning  rate  equal  to  10'^  and  an  optimum 
momentum  term  equal  to  0.1.  Figure  6  shows  that  the  (4-6-2)  multi-layer 
pereeptron  represents  the  optimal  arehiteeture  with  minimum  error  equal  to 
1.3x10'^  obtained  after  40  eyeles.  Moreover,  figure  9  (green  line)  shows  the 
responses  of  the  joints  positions  to  the  desired  inputs,  and  figure  1 1  shows  the 
evolution  of  the  instantaneous  square  error  between  the  desired  and  aetual 
position. 

4-4  -  Comparison  between  the  three  methods: 

By  making  a  eomparison  between  the  three  methods  with  respeet  to  the 
eumulative  error,  figure  6  shows  that  the  first  method  is  the  best  in 
performanees.  Its  eumulative  error  is  2.5  times  less  than  the  seeond  method  and 
5  times  less  than  the  third  method.  But  the  third  method  needs,  in  its  optimal 
arehiteeture,  less  parameter  than  the  two  others  do.  While  the  first  method  needs 
218  parameters  and  the  seeond  method  128  parameters,  the  third  method  need 
only  44  parameters.  Figure  11  shows  the  instantaneous  temporal  square  error 
obtained  by  the  three  methods  between  the  desired  trajeetory  and  the  aetual 
response.  It  is  elear  that  the  first  method  is  the  best  in  aeeuraey. 

5  -  COMMENTS  AND  PERSPECTIVES: 


Applieations  in  new  teehnologies  sueh  as  roboties,  manufaeturing,  spaee 
teehnology,  and  medieal  instrumentation,  as  well  as  those  in  older  teehnologies 
sueh  as  proeess  eontrol  and  aireraft  eontrol,  are  ereating  a  wide  speetrum  of 
eontrol  problems  in  whieh  non-linearity,  uneertainties,  and  eomplexity  play  a 
major  role  [17].  For  the  solution  of  many  of  these  problems  teehniques  based  on 
artifieial  neural  network  are  beginning  to  eomplement  eonventional  eontrol 
teehniques  [18],  and  in  some  eases  they  are  emerging  as  the  only  viable 
alternatives.  From  a  eontrol  theoretie  point  of  view,  artifieial  neural  network 
may  be  eonsidered  as  traetable  parameterized  families  of  nonlinear  maps.  As 
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such  they  have  found  wide  applieation  in  pattern  reeognition  problems,  whieh 
require  nonlinear  deeision  surfaees.  With  the  introduetion  of  dynamies  and 
feedbaek,  the  seope  of  sueh  networks  as  identifiers  and  eontrollers  in  nonlinear 
dynamieal  systems  has  inereased  signifieantly. 

Our  works  aims  to  study  three  types  of  neural  network  eontrollers  for 
trajeetory  eontrol  of  robotie  manipulators,  and  to  investigate  the  performanees 
of  the  applieation  of  these  eontrollers  (multi-layer  perceptrons,  MLP)  for  elosed 
loop  2D  planar  robot  arm  by  using  the  baek-propagation  methods  for  the 
adjustment  of  parameters.  We  have  implemented  the  different  algorithms  and 
we  have  obtained  the  optimal  architeeture  in  the  three  different  methods.  Good 
results  where  obtained  through  the  eonvergenee  of  the  algorithms.  The  first 
method  presents  the  smallest  eumulative  error  in  eomparison  with  the  two  other 
methods  (figure  11)  and  the  best  response-time  for  the  joint  angle  and  veloeity. 
But  the  third  method  uses  less  input  eomponents  and  hidden  neurons  making  the 
real-time  proeess  faster.  Moreover,  this  third  method  presents  praetieal 
advantages  over  the  two  other  methods  in  that  eompensation  is  done  outside  the 
eontrol  loop  so  it  ean  be  implemented  easily  at  the  eommand  trajeetory  planning 
level  external  to  an  existing  robot  eontroller. 

Our  target  from  this  work  is  the  real-time  proeess  (implementation  on 
DSP  of  different  arehiteetures  in  neural  networks  and  studying  the  behavior  of 
all  the  proeess). 
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RESULTS  OF  THE  SIMULATIONS 


Neurons/ layer 

Fig.6:  -Optimized  architecture  of  the  neural  network  controller  for  the  three  methods 
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Fig.7:  -Optimized  learning  rate  and  Momentum  term  for  the  three  methods 
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FiS:8  -Simulated  trajectory  of  the  end-effector  in  the  yz  Cartesian  plane 


Fig:9  :  -Desired  and  actual  joints  position  responses 
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